Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 31 |
| Number of observations | 1000000 |
| Missing cells | 541771 |
| Missing cells (%) | 1.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 236.5 MiB |
| Average record size in memory | 248.0 B |
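The two derived rows can be checked against the raw counts. This is a minimal sketch of that arithmetic. It assumes the missing-cell percentage is missing cells divided by rows times columns, and that average record size is total memory divided by rows; the report does not state its formulas.

```python
# Figures copied from the Dataset statistics table.
n_variables = 31
n_observations = 1_000_000
missing_cells = 541_771
total_size_mib = 236.5

# Assumption: share of missing cells = missing / (rows * columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: average record size = total bytes / rows (1 MiB = 1024**2 bytes).
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both figures round to the values shown in the table.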
Variable types
| Type | Count |
|---|---|
| NUM | 17 |
| CAT | 10 |
| BOOL | 4 |
Reproduction
| Item | Value |
|---|---|
| Analysis started | 2020-07-13 08:50:30.526124 |
| Analysis finished | 2020-07-13 09:01:21.679568 |
| Duration | 10 minutes and 51.15 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| EVENTDATE has constant value "20190802" | Constant |
| PROVIDER has constant value "1000" | Constant |
| ORIGINATIONNETWORKID has constant value "1" | Constant |
| MONTHID has constant value "201908" | Constant |
| LOAD_ID has constant value "1" | Constant |
| LOADDATE has constant value "03-AUG-19" | Constant |
| CLASSIFICATION has a high cardinality: 1741 distinct values | High cardinality |
| ROAMING_DETAILS has a high cardinality: 62 distinct values | High cardinality |
| RELOADFACEVALUE is highly correlated with BALANCERELOAD | High correlation |
| BALANCERELOAD is highly correlated with RELOADFACEVALUE | High correlation |
| HOME_CELLID is highly correlated with CELL_ID | High correlation |
| CELL_ID is highly correlated with HOME_CELLID | High correlation |
| CELL_ID has 123307 (12.3%) missing values | Missing |
| ROAMING_DETAILS has 108880 (10.9%) missing values | Missing |
| PS_TYPE has 303324 (30.3%) missing values | Missing |
| COSID is highly skewed (γ1 = 26.63298874) | Skewed |
| SHORTCODEID is highly skewed (γ1 = 188.3301685) | Skewed |
| TOT_CHARGED_AMT is highly skewed (γ1 = 47.05091688) | Skewed |
| BALANCERELOAD is highly skewed (γ1 = 69.40334234) | Skewed |
| NO_OF_EVENTS is highly skewed (γ1 = 428.6914845) | Skewed |
| BONUS is highly skewed (γ1 = -100.6923622) | Skewed |
| TOT_ROUNDED_VOL is highly skewed (γ1 = 38.09800078) | Skewed |
| CELL_ID is highly skewed (γ1 = 211.7311456) | Skewed |
| RELOADFACEVALUE is highly skewed (γ1 = 74.06165783) | Skewed |
| HOME_CELLID is highly skewed (γ1 = 222.5416205) | Skewed |
| ORIGINATING_COUNTRY_ID is highly skewed (γ1 = -121.6382015) | Skewed |
| SHORTCODEID has 972545 (97.3%) zeros | Zeros |
| TOT_CHARGED_AMT has 832314 (83.2%) zeros | Zeros |
| BALANCERELOAD has 977344 (97.7%) zeros | Zeros |
| TOT_ACTUAL_DURATION has 154746 (15.5%) zeros | Zeros |
| BONUS has 997234 (99.7%) zeros | Zeros |
| TOT_ROUNDED_VOL has 467895 (46.8%) zeros | Zeros |
| RELOADFACEVALUE has 976275 (97.6%) zeros | Zeros |
| DESTINATION_COUNTRY_ID has 750311 (75.0%) zeros | Zeros |
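The "Skewed" entries report the skewness coefficient γ1. As a rough illustration of how such a flag could be derived, the sketch below computes the population skewness (third central moment divided by the 1.5 power of the second) and compares its absolute value with a cutoff. The cutoff of 20 is an assumption, chosen only because every flagged γ1 above exceeds 20 in absolute value. The sample column is hypothetical, and the profiling tool may apply a bias correction that this sketch omits.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

SKEW_THRESHOLD = 20  # assumed cutoff, not taken from the report

def is_highly_skewed(values, threshold=SKEW_THRESHOLD):
    # Absolute value, so strongly negative skew (as for BONUS) is flagged too.
    return abs(skewness(values)) > threshold

# Hypothetical zero-inflated column: 9,999 zeros and a single large value.
sample = [0] * 9_999 + [300]
print(round(skewness(sample), 2), is_highly_skewed(sample))
```

A column dominated by zeros with a few extreme values, like several of the variables flagged above, gives a very large γ1 under this definition.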
EVENTDATE
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 20190802 | 1000000 | 100.0% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
EVENT_LABEL
Real number (ℝ≥0)
| Distinct count | 15 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.913653 |
|---|---|
| Minimum | 1 |
| Maximum | 300 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 25 |
| median | 46 |
| Q3 | 46 |
| 95-th percentile | 74 |
| Maximum | 300 |
| Range | 299 |
| Interquartile range (IQR) | 21 |
Descriptive statistics
| Standard deviation | 28.25856677 |
|---|---|
| Coefficient of variation (CV) | 0.7261864304 |
| Kurtosis | 7.065870671 |
| Mean | 38.913653 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.583467352 |
| Sum | 38913653 |
| Variance | 798.5465957 |
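Several of these statistics are simple functions of one another. The sketch below recomputes four of them from the EVENT_LABEL figures above, using the standard definitions (coefficient of variation as standard deviation over mean, variance as squared standard deviation). It is a consistency check on the printed numbers, not a reconstruction of the tool's internals.

```python
# Figures copied from the EVENT_LABEL statistics.
mean = 38.913653
std = 28.25856677
q1, q3 = 25, 46
minimum, maximum = 1, 300

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.4f} variance={variance:.2f} IQR={iqr} range={value_range}")
```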
| Value | Count | Frequency (%) | |
| 46 | 659492 | 65.9% | |
| 1 | 189742 | 19.0% | |
| 4 | 46401 | 4.6% | |
| 25 | 35871 | 3.6% | |
| 74 | 33148 | 3.3% | |
| 139 | 17201 | 1.7% | |
| 178 | 4765 | 0.5% | |
| 179 | 4464 | 0.4% | |
| 68 | 4035 | 0.4% | |
| 133 | 2255 | 0.2% | |
| Other values (5) | 2626 | 0.3% |
| Value | Count | Frequency (%) | |
| 1 | 189742 | 19.0% | |
| 2 | 48 | < 0.1% | |
| 4 | 46401 | 4.6% | |
| 25 | 35871 | 3.6% | |
| 46 | 659492 | 65.9% |
| Value | Count | Frequency (%) | |
| 300 | 101 | < 0.1% | |
| 182 | 26 | < 0.1% | |
| 179 | 4464 | 0.4% | |
| 178 | 4765 | 0.5% | |
| 139 | 17201 | 1.7% |
PROVIDER
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 1000 | 1000000 | 100.0% |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
COSID
| Distinct count | 60 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1068.848268 |
|---|---|
| Minimum | 1005 |
| Maximum | 7039 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1005 |
|---|---|
| 5-th percentile | 1020 |
| Q1 | 1021 |
| median | 1070 |
| Q3 | 1090 |
| 95-th percentile | 1090 |
| Maximum | 7039 |
| Range | 6034 |
| Interquartile range (IQR) | 69 |
Descriptive statistics
| Standard deviation | 220.0750486 |
|---|---|
| Coefficient of variation (CV) | 0.2058992424 |
| Kurtosis | 718.9617326 |
| Mean | 1068.848268 |
| Median Absolute Deviation (MAD) | 20 |
| Skewness | 26.63298874 |
| Sum | 1068848268 |
| Variance | 48433.02703 |
| Value | Count | Frequency (%) | |
| 1070 | 432283 | 43.2% | |
| 1090 | 248198 | 24.8% | |
| 1021 | 213433 | 21.3% | |
| 1020 | 77031 | 7.7% | |
| 1121 | 5303 | 0.5% | |
| 1092 | 4906 | 0.5% | |
| 1044 | 3498 | 0.3% | |
| 1084 | 2839 | 0.3% | |
| 1078 | 2597 | 0.3% | |
| 1045 | 1494 | 0.1% | |
| Other values (50) | 8418 | 0.8% |
| Value | Count | Frequency (%) | |
| 1005 | 196 | < 0.1% | |
| 1007 | 2 | < 0.1% | |
| 1010 | 9 | < 0.1% | |
| 1011 | 1 | < 0.1% | |
| 1012 | 18 | < 0.1% |
| Value | Count | Frequency (%) | |
| 7039 | 383 | < 0.1% | |
| 7038 | 8 | < 0.1% | |
| 7037 | 67 | < 0.1% | |
| 7036 | 14 | < 0.1% | |
| 7035 | 22 | < 0.1% |
MSISDN
Real number (ℝ≥0)
| Distinct count | 628527 |
|---|---|
| Unique (%) | 62.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 233328081205.61676 |
|---|---|
| Minimum | 233200000005 |
| Maximum | 233579999998 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 2.332e+11 |
|---|---|
| 5-th percentile | 2.332009404e+11 |
| Q1 | 2.332048586e+11 |
| median | 2.332089905e+11 |
| Q3 | 2.335038607e+11 |
| 95-th percentile | 2.335088371e+11 |
| Maximum | 2.3358e+11 |
| Range | 379999993 |
| Interquartile range (IQR) | 299002130 |
Descriptive statistics
| Standard deviation | 146712907.4 |
|---|---|
| Coefficient of variation (CV) | 0.0006287837564 |
| Kurtosis | -1.837075611 |
| Mean | 2.333280812e+11 |
| Median Absolute Deviation (MAD) | 7680246 |
| Skewness | 0.3845085067 |
| Sum | 2.333280812e+17 |
| Variance | 2.152467719e+16 |
| Value | Count | Frequency (%) | |
| 2.335006119e+11 | 39 | < 0.1% | |
| 2.332023045e+11 | 33 | < 0.1% | |
| 2.332082907e+11 | 31 | < 0.1% | |
| 2.332053568e+11 | 27 | < 0.1% | |
| 2.332468555e+11 | 27 | < 0.1% | |
| 2.33203757e+11 | 26 | < 0.1% | |
| 2.332776212e+11 | 25 | < 0.1% | |
| 2.335073757e+11 | 24 | < 0.1% | |
| 2.33202741e+11 | 24 | < 0.1% | |
| 2.33206317e+11 | 24 | < 0.1% | |
| Other values (628517) | 999720 | > 99.9% |
| Value | Count | Frequency (%) | |
| 2.332e+11 | 3 | < 0.1% | |
| 2.332e+11 | 1 | < 0.1% | |
| 2.332e+11 | 1 | < 0.1% | |
| 2.332000001e+11 | 4 | < 0.1% | |
| 2.332000001e+11 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2.3358e+11 | 1 | < 0.1% | |
| 2.335799699e+11 | 2 | < 0.1% | |
| 2.335799193e+11 | 1 | < 0.1% | |
| 2.335799123e+11 | 2 | < 0.1% | |
| 2.335798663e+11 | 1 | < 0.1% |
ORIGINATIONNETWORKID
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 1 | 1000000 | 100.0% |
DESTINATIONNETWORKID
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.130557 |
|---|---|
| Minimum | 1 |
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.4967233802 |
|---|---|
| Coefficient of variation (CV) | 0.4393616423 |
| Kurtosis | 60.6926955 |
| Mean | 1.130557 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.560204839 |
| Sum | 1130557 |
| Variance | 0.2467341165 |
| Value | Count | Frequency (%) | |
| 1 | 902630 | 90.3% | |
| 2 | 80177 | 8.0% | |
| 3 | 10097 | 1.0% | |
| 4 | 4008 | 0.4% | |
| 7 | 2722 | 0.3% | |
| 6 | 366 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 902630 | 90.3% | |
| 2 | 80177 | 8.0% | |
| 3 | 10097 | 1.0% | |
| 4 | 4008 | 0.4% | |
| 6 | 366 | < 0.1% |
| Value | Count | Frequency (%) | |
| 7 | 2722 | 0.3% | |
| 6 | 366 | < 0.1% | |
| 4 | 4008 | 0.4% | |
| 3 | 10097 | 1.0% | |
| 2 | 80177 | 8.0% |
ROAMING_FLAG
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 0 | 999905 | > 99.9% | |
| 1 | 95 | < 0.1% |
BILLED_FLAG
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| N | 835599 | 83.6% | |
| Y | 164401 | 16.4% |
SHORTCODEID
| Distinct count | 62 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 62.394336 |
|---|---|
| Minimum | 0 |
| Maximum | 233313 |
| Zeros | 972545 |
| Zeros (%) | 97.3% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 233313 |
| Range | 233313 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1026.78829 |
|---|---|
| Coefficient of variation (CV) | 16.45643428 |
| Kurtosis | 42609.3463 |
| Mean | 62.394336 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 188.3301685 |
| Sum | 62394336 |
| Variance | 1054294.192 |
| Value | Count | Frequency (%) | |
| 0 | 972545 | 97.3% | |
| 1991 | 9966 | 1.0% | |
| 1995 | 9177 | 0.9% | |
| 580 | 4834 | 0.5% | |
| 6111 | 1662 | 0.2% | |
| 7111 | 874 | 0.1% | |
| 134 | 164 | < 0.1% | |
| 2000 | 136 | < 0.1% | |
| 1906 | 106 | < 0.1% | |
| 570 | 78 | < 0.1% | |
| Other values (52) | 458 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 972545 | 97.3% | |
| 100 | 1 | < 0.1% | |
| 117 | 1 | < 0.1% | |
| 134 | 164 | < 0.1% | |
| 150 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 233313 | 16 | < 0.1% | |
| 18555 | 2 | < 0.1% | |
| 7111 | 874 | 0.1% | |
| 7007 | 1 | < 0.1% | |
| 6111 | 1662 | 0.2% |
TOT_CHARGED_AMT
| Distinct count | 7002 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.09869795255 |
|---|---|
| Minimum | -5.0 |
| Maximum | 199.0 |
| Zeros | 832314 |
| Zeros (%) | 83.2% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | -5 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.3115 |
| Maximum | 199 |
| Range | 204 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.8969460429 |
|---|---|
| Coefficient of variation (CV) | 9.087787738 |
| Kurtosis | 6086.053508 |
| Mean | 0.09869795255 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 47.05091688 |
| Sum | 98697.95255 |
| Variance | 0.8045122039 |
| Value | Count | Frequency (%) | |
| 0 | 832314 | 83.2% | |
| 0.15 | 19402 | 1.9% | |
| 0.3 | 6731 | 0.7% | |
| 0.4 | 4952 | 0.5% | |
| 0.2 | 3917 | 0.4% | |
| 2 | 3780 | 0.4% | |
| 0.6 | 3769 | 0.4% | |
| 0.25 | 3645 | 0.4% | |
| 0.1125 | 3111 | 0.3% | |
| 0.35 | 2773 | 0.3% | |
| Other values (6992) | 115606 | 11.6% |
| Value | Count | Frequency (%) | |
| -5 | 1 | < 0.1% | |
| 0 | 832314 | 83.2% | |
| 1e-05 | 2 | < 0.1% | |
| 2e-05 | 2 | < 0.1% | |
| 3e-05 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 199 | 2 | < 0.1% | |
| 100 | 3 | < 0.1% | |
| 95.6 | 1 | < 0.1% | |
| 58.35 | 1 | < 0.1% | |
| 50.05 | 2 | < 0.1% |
BALANCERELOAD
| Distinct count | 177 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.09275834200000001 |
|---|---|
| Minimum | 0.0 |
| Maximum | 300.0 |
| Zeros | 977344 |
| Zeros (%) | 97.7% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 300 |
| Range | 300 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.205215955 |
|---|---|
| Coefficient of variation (CV) | 12.99307349 |
| Kurtosis | 11103.3117 |
| Mean | 0.092758342 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 69.40334234 |
| Sum | 92758.342 |
| Variance | 1.452545498 |
| Value | Count | Frequency (%) | |
| 0 | 977344 | 97.7% | |
| 2 | 8681 | 0.9% | |
| 5 | 3152 | 0.3% | |
| 1 | 2718 | 0.3% | |
| 2.8 | 1264 | 0.1% | |
| 10 | 1037 | 0.1% | |
| 0.8 | 974 | 0.1% | |
| 1.8 | 905 | 0.1% | |
| 20 | 620 | 0.1% | |
| 4 | 463 | < 0.1% | |
| Other values (167) | 2842 | 0.3% |
| Value | Count | Frequency (%) | |
| 0 | 977344 | 97.7% | |
| 0.1 | 17 | < 0.1% | |
| 0.2 | 31 | < 0.1% | |
| 0.3 | 20 | < 0.1% | |
| 0.35 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 300 | 1 | < 0.1% | |
| 268.02 | 1 | < 0.1% | |
| 225 | 2 | < 0.1% | |
| 200 | 1 | < 0.1% | |
| 150 | 2 | < 0.1% |
TOT_ACTUAL_DURATION
| Distinct count | 52373 |
|---|---|
| Unique (%) | 5.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4583.99856 |
|---|---|
| Minimum | 0 |
| Maximum | 178262 |
| Zeros | 154746 |
| Zeros (%) | 15.5% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 29 |
| median | 476 |
| Q3 | 3721 |
| 95-th percentile | 24128 |
| Maximum | 178262 |
| Range | 178262 |
| Interquartile range (IQR) | 3692 |
Descriptive statistics
| Standard deviation | 10589.30818 |
|---|---|
| Coefficient of variation (CV) | 2.310059229 |
| Kurtosis | 22.00700643 |
| Mean | 4583.99856 |
| Median Absolute Deviation (MAD) | 476 |
| Skewness | 4.181342138 |
| Sum | 4583998560 |
| Variance | 112133447.7 |
| Value | Count | Frequency (%) | |
| 0 | 154746 | 15.5% | |
| 1 | 5523 | 0.6% | |
| 3 | 5243 | 0.5% | |
| 2 | 4948 | 0.5% | |
| 4 | 4863 | 0.5% | |
| 5 | 4282 | 0.4% | |
| 30 | 3669 | 0.4% | |
| 10 | 3543 | 0.4% | |
| 6 | 3495 | 0.3% | |
| 15 | 3486 | 0.3% | |
| Other values (52363) | 806202 | 80.6% |
| Value | Count | Frequency (%) | |
| 0 | 154746 | 15.5% | |
| 1 | 5523 | 0.6% | |
| 2 | 4948 | 0.5% | |
| 3 | 5243 | 0.5% | |
| 4 | 4863 | 0.5% |
| Value | Count | Frequency (%) | |
| 178262 | 1 | < 0.1% | |
| 176367 | 1 | < 0.1% | |
| 170144 | 1 | < 0.1% | |
| 159451 | 1 | < 0.1% | |
| 158034 | 1 | < 0.1% |
NO_OF_EVENTS
| Distinct count | 222 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.493332 |
|---|---|
| Minimum | 1 |
| Maximum | 10241 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 10241 |
| Range | 10240 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 17.6903337 |
|---|---|
| Coefficient of variation (CV) | 11.84621618 |
| Kurtosis | 213072.4914 |
| Mean | 1.493332 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 428.6914845 |
| Sum | 1493332 |
| Variance | 312.9479065 |
| Value | Count | Frequency (%) | |
| 1 | 807911 | 80.8% | |
| 2 | 118788 | 11.9% | |
| 3 | 36008 | 3.6% | |
| 4 | 15440 | 1.5% | |
| 5 | 7530 | 0.8% | |
| 6 | 4334 | 0.4% | |
| 7 | 2484 | 0.2% | |
| 8 | 1629 | 0.2% | |
| 9 | 1132 | 0.1% | |
| 10 | 787 | 0.1% | |
| Other values (212) | 3957 | 0.4% |
| Value | Count | Frequency (%) | |
| 1 | 807911 | 80.8% | |
| 2 | 118788 | 11.9% | |
| 3 | 36008 | 3.6% | |
| 4 | 15440 | 1.5% | |
| 5 | 7530 | 0.8% |
| Value | Count | Frequency (%) | |
| 10241 | 1 | < 0.1% | |
| 8981 | 1 | < 0.1% | |
| 7143 | 1 | < 0.1% | |
| 4759 | 1 | < 0.1% | |
| 3063 | 1 | < 0.1% |
BONUS
| Distinct count | 195 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -2093.455698 |
|---|---|
| Minimum | -30000000 |
| Maximum | 550500 |
| Zeros | 997234 |
| Zeros (%) | 99.7% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | -30000000 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 550500 |
| Range | 30550500 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 88311.6292 |
|---|---|
| Coefficient of variation (CV) | -42.18461814 |
| Kurtosis | 19338.68702 |
| Mean | -2093.455698 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -100.6923622 |
| Sum | -2093455698 |
| Variance | 7798943853 |
| Value | Count | Frequency (%) | |
| 0 | 997234 | 99.7% | |
| -100000 | 306 | < 0.1% | |
| -25600 | 295 | < 0.1% | |
| -200000 | 218 | < 0.1% | |
| -1048576 | 205 | < 0.1% | |
| -500000 | 158 | < 0.1% | |
| -81920 | 157 | < 0.1% | |
| -307200 | 140 | < 0.1% | |
| -5242880 | 118 | < 0.1% | |
| -460800 | 112 | < 0.1% | |
| Other values (185) | 1057 | 0.1% |
| Value | Count | Frequency (%) | |
| -30000000 | 1 | < 0.1% | |
| -15728640 | 2 | < 0.1% | |
| -13401000 | 1 | < 0.1% | |
| -12000000 | 1 | < 0.1% | |
| -11000000 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 550500 | 1 | < 0.1% | |
| 317600 | 1 | < 0.1% | |
| 288900 | 1 | < 0.1% | |
| 282950 | 1 | < 0.1% | |
| 264350 | 1 | < 0.1% |
TOT_ROUNDED_VOL
| Distinct count | 380994 |
|---|---|
| Unique (%) | 38.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9121522.18075 |
|---|---|
| Minimum | 0 |
| Maximum | 10815727200 |
| Zeros | 467895 |
| Zeros (%) | 46.8% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 3091 |
| Q3 | 308130.75 |
| 95-th percentile | 32983308.9 |
| Maximum | 1.08157272e+10 |
| Range | 1.08157272e+10 |
| Interquartile range (IQR) | 308130.75 |
Descriptive statistics
| Standard deviation | 72578509.72 |
|---|---|
| Coefficient of variation (CV) | 7.956841882 |
| Kurtosis | 2946.746799 |
| Mean | 9121522.181 |
| Median Absolute Deviation (MAD) | 3091 |
| Skewness | 38.09800078 |
| Sum | 9.121522181e+12 |
| Variance | 5.267640073e+15 |
| Value | Count | Frequency (%) | |
| 0 | 467895 | 46.8% | |
| 140 | 1216 | 0.1% | |
| 129 | 664 | 0.1% | |
| 100 | 389 | < 0.1% | |
| 123 | 270 | < 0.1% | |
| 176 | 240 | < 0.1% | |
| 131 | 233 | < 0.1% | |
| 152 | 215 | < 0.1% | |
| 300 | 210 | < 0.1% | |
| 216 | 200 | < 0.1% | |
| Other values (380984) | 528468 | 52.8% |
| Value | Count | Frequency (%) | |
| 0 | 467895 | 46.8% | |
| 28 | 1 | < 0.1% | |
| 40 | 31 | < 0.1% | |
| 41 | 1 | < 0.1% | |
| 42 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1.08157272e+10 | 1 | < 0.1% | |
| 9700207492 | 1 | < 0.1% | |
| 9238348366 | 1 | < 0.1% | |
| 8848640000 | 1 | < 0.1% | |
| 8052172800 | 1 | < 0.1% |
CELL_ID
| Distinct count | 23657 |
|---|---|
| Unique (%) | 2.7% |
| Missing | 123307 |
| Missing (%) | 12.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28846.459548553485 |
|---|---|
| Minimum | 1.0 |
| Maximum | 9835523.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 11238 |
| Q1 | 17660 |
| median | 26743 |
| Q3 | 36084 |
| 95-th percentile | 61661 |
| Maximum | 9835523 |
| Range | 9835522 |
| Interquartile range (IQR) | 18424 |
Descriptive statistics
| Standard deviation | 40425.06396 |
|---|---|
| Coefficient of variation (CV) | 1.401387366 |
| Kurtosis | 51352.17727 |
| Mean | 28846.45955 |
| Median Absolute Deviation (MAD) | 9249 |
| Skewness | 211.7311456 |
| Sum | 2.528948916e+10 |
| Variance | 1634185796 |
| Value | Count | Frequency (%) | |
| 65535 | 10380 | 1.0% | |
| 2872 | 864 | 0.1% | |
| 23498 | 829 | 0.1% | |
| 13498 | 809 | 0.1% | |
| 24219 | 592 | 0.1% | |
| 14219 | 565 | 0.1% | |
| 28640 | 512 | 0.1% | |
| 23499 | 508 | 0.1% | |
| 43498 | 466 | < 0.1% | |
| 13499 | 465 | < 0.1% | |
| Other values (23647) | 860703 | 86.1% | |
| (Missing) | 123307 | 12.3% |
| Value | Count | Frequency (%) | |
| 1 | 17 | < 0.1% | |
| 3 | 8 | < 0.1% | |
| 4 | 4 | < 0.1% | |
| 5 | 193 | < 0.1% | |
| 7 | 11 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9835523 | 6 | < 0.1% | |
| 9835521 | 7 | < 0.1% | |
| 193382 | 1 | < 0.1% | |
| 193373 | 1 | < 0.1% | |
| 192992 | 1 | < 0.1% |
CLASSIFICATION
| Distinct count | 1741 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 14 |
| Missing (%) | < 0.1% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| VOICE | 170890 | 17.1% | |
| 500 | 140413 | 14.0% | |
| 100 | 138313 | 13.8% | |
| 1016 | 111466 | 11.1% | |
| 1010 | 102633 | 10.3% | |
| 1011 | 94000 | 9.4% | |
| SMS | 35871 | 3.6% | |
| 1018 | 35046 | 3.5% | |
| 1017 | 23561 | 2.4% | |
| Kirusa | 18900 | 1.9% | |
| Other values (1731) | 128893 | 12.9% |
Length
| Max length | 19 |
|---|---|
| Median length | 4 |
| Mean length | 4.873047 |
| Min length | 2 |
PROMOTION_CD
Categorical
| Distinct count | 11 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| -1 | 936395 | 93.6% | |
| 0 | 58708 | 5.9% | |
| 201 | 4247 | 0.4% | |
| YDB | 418 | < 0.1% | |
| VCB | 73 | < 0.1% | |
| PWD | 67 | < 0.1% | |
| W4D | 49 | < 0.1% | |
| ETB | 40 | < 0.1% | |
| ARB | 1 | < 0.1% | |
| RFS | 1 | < 0.1% |
Length
| Max length | 3 |
|---|---|
| Median length | 2 |
| Mean length | 1.946189 |
| Min length | 1 |
PEAKID
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 1 | 798959 | 79.9% | |
| 2 | 89263 | 8.9% | |
| 5 | 61677 | 6.2% | |
| 3 | 50101 | 5.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
RELOADFACEVALUE
| Distinct count | 118 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.09741394199999996 |
|---|---|
| Minimum | 0.0 |
| Maximum | 300.0 |
| Zeros | 976275 |
| Zeros (%) | 97.6% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 300 |
| Range | 300 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.159820336 |
|---|---|
| Coefficient of variation (CV) | 11.90610207 |
| Kurtosis | 12783.31833 |
| Mean | 0.097413942 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 74.06165783 |
| Sum | 97413.942 |
| Variance | 1.345183212 |
| Value | Count | Frequency (%) | |
| 0 | 976275 | 97.6% | |
| 2 | 10268 | 1.0% | |
| 5 | 4598 | 0.5% | |
| 1 | 3349 | 0.3% | |
| 10 | 1536 | 0.2% | |
| 3 | 1324 | 0.1% | |
| 4 | 444 | < 0.1% | |
| 20 | 437 | < 0.1% | |
| 7 | 270 | < 0.1% | |
| 6 | 244 | < 0.1% | |
| Other values (108) | 1255 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 976275 | 97.6% | |
| 0.1 | 9 | < 0.1% | |
| 0.2 | 26 | < 0.1% | |
| 0.3 | 6 | < 0.1% | |
| 0.4 | 8 | < 0.1% |
| Value | Count | Frequency (%) | |
| 300 | 1 | < 0.1% | |
| 268.02 | 1 | < 0.1% | |
| 225 | 2 | < 0.1% | |
| 200 | 1 | < 0.1% | |
| 150 | 2 | < 0.1% |
IMSI
Real number (ℝ≥0)
| Distinct count | 625892 |
|---|---|
| Unique (%) | 62.9% |
| Missing | 5363 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 620020522424545.0 |
|---|---|
| Minimum | 620020120000065.0 |
| Maximum | 620020549438651.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 6.2002012e+14 |
|---|---|
| 5-th percentile | 6.200205006e+14 |
| Q1 | 6.200205203e+14 |
| median | 6.20020531e+14 |
| Q3 | 6.200205406e+14 |
| 95-th percentile | 6.200205457e+14 |
| Maximum | 6.200205494e+14 |
| Range | 429438586 |
| Interquartile range (IQR) | 20296428 |
Descriptive statistics
| Standard deviation | 46480919 |
|---|---|
| Coefficient of variation (CV) | 7.496674274e-08 |
| Kurtosis | 43.49822939 |
| Mean | 6.200205224e+14 |
| Median Absolute Deviation (MAD) | 10508568 |
| Skewness | -6.019069216 |
| Sum | 6.166953524e+20 |
| Variance | 2.160475831e+15 |
| Value | Count | Frequency (%) | |
| 6.20020543e+14 | 39 | < 0.1% | |
| 6.200205238e+14 | 33 | < 0.1% | |
| 6.200205011e+14 | 31 | < 0.1% | |
| 6.20020536e+14 | 27 | < 0.1% | |
| 6.200205386e+14 | 27 | < 0.1% | |
| 6.200205299e+14 | 26 | < 0.1% | |
| 6.200205267e+14 | 25 | < 0.1% | |
| 6.200205306e+14 | 24 | < 0.1% | |
| 6.200204104e+14 | 24 | < 0.1% | |
| 6.200205235e+14 | 24 | < 0.1% | |
| Other values (625882) | 994357 | 99.4% | |
| (Missing) | 5363 | 0.5% |
| Value | Count | Frequency (%) | |
| 6.2002012e+14 | 4 | < 0.1% | |
| 6.2002012e+14 | 2 | < 0.1% | |
| 6.2002012e+14 | 1 | < 0.1% | |
| 6.2002012e+14 | 1 | < 0.1% | |
| 6.2002012e+14 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6.200205494e+14 | 1 | < 0.1% | |
| 6.200205494e+14 | 1 | < 0.1% | |
| 6.200205494e+14 | 1 | < 0.1% | |
| 6.200205494e+14 | 1 | < 0.1% | |
| 6.200205494e+14 | 2 | < 0.1% |
MONTHID
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 201908 | 1000000 | 100.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
LOAD_ID
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 1 | 1000000 | 100.0% |
LOADDATE
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 03-AUG-19 | 1000000 | 100.0% |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
CLASSIFICATION2
Categorical
| Distinct count | 22 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| GPRS | 659272 | 65.9% | |
| VOICE | 189790 | 19.0% | |
| SMS | 35871 | 3.6% | |
| MT_REVENUE | 33148 | 3.3% | |
| THIRD_PARTY_DEDUCTION | 20811 | 2.1% | |
| BUNDLE_SUBSCRIPTION | 18046 | 1.8% | |
| USSD_RELOAD | 15884 | 1.6% | |
| SOS_TOPUP | 4765 | 0.5% | |
| SOS_PAYMENT | 4464 | 0.4% | |
| CRBT_REVENUE | 3935 | 0.4% | |
| Other values (12) | 14014 | 1.4% |
Length
| Max length | 21 |
|---|---|
| Median length | 4 |
| Mean length | 5.294834 |
| Min length | 3 |
ROAMING_DETAILS
| Distinct count | 62 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 108880 |
| Missing (%) | 10.9% |
| Memory size | 7.6 MiB |
| Value | Count | Frequency (%) | |
| 233200005 | 231532 | 23.2% | |
| 080.087.092.020 | 138251 | 13.8% | |
| 080.087.092.022 | 136793 | 13.7% | |
| 080.087.092.028 | 120836 | 12.1% | |
| 080.087.092.030 | 120444 | 12.0% | |
| 080.087.092.091 | 51947 | 5.2% | |
| 080.087.092.111 | 51272 | 5.1% | |
| 080.087.092.121 | 10033 | 1.0% | |
| 080.087.092.102 | 10028 | 1.0% | |
| 080.087.092.122 | 9943 | 1.0% | |
| Other values (52) | 10041 | 1.0% | |
| (Missing) | 108880 | 10.9% |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 28.84248 |
| Min length | 3 |
HOME_CELLID
| Distinct count | 23735 |
|---|---|
| Unique (%) | 2.4% |
| Missing | 883 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28491.94235710132 |
|---|---|
| Minimum | 1.0 |
| Maximum | 9835523.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 11332 |
| Q1 | 17496 |
| median | 26467 |
| Q3 | 35526 |
| 95-th percentile | 61317 |
| Maximum | 9835523 |
| Range | 9835522 |
| Interquartile range (IQR) | 18030 |
Descriptive statistics
| Standard deviation | 38065.96932 |
|---|---|
| Coefficient of variation (CV) | 1.336025773 |
| Kurtosis | 57320.55552 |
| Mean | 28491.94236 |
| Median Absolute Deviation (MAD) | 9038 |
| Skewness | 222.5416205 |
| Sum | 2.846678397e+10 |
| Variance | 1449018021 |
| Value | Count | Frequency (%) | |
| 65535 | 10380 | 1.0% | |
| 2872 | 866 | 0.1% | |
| 23498 | 831 | 0.1% | |
| 13498 | 812 | 0.1% | |
| 24219 | 600 | 0.1% | |
| 28640 | 590 | 0.1% | |
| 14219 | 577 | 0.1% | |
| 25356 | 514 | 0.1% | |
| 23499 | 508 | 0.1% | |
| 28605 | 474 | < 0.1% | |
| Other values (23725) | 982965 | 98.3% | |
| (Missing) | 883 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 17 | < 0.1% | |
| 3 | 8 | < 0.1% | |
| 4 | 4 | < 0.1% | |
| 5 | 193 | < 0.1% | |
| 7 | 11 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9835523 | 6 | < 0.1% | |
| 9835521 | 7 | < 0.1% | |
| 193382 | 1 | < 0.1% | |
| 193373 | 1 | < 0.1% | |
| 192992 | 1 | < 0.1% |
ORIGINATING_COUNTRY_ID
| Distinct count | 25 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1125.929182 |
|---|---|
| Minimum | 1 |
| Maximum | 1126 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1126 |
| Q1 | 1126 |
| median | 1126 |
| Q3 | 1126 |
| 95-th percentile | 1126 |
| Maximum | 1126 |
| Range | 1125 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 8.089548829 |
|---|---|
| Coefficient of variation (CV) | 0.007184775879 |
| Kurtosis | 15293.33835 |
| Mean | 1125.929182 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -121.6382015 |
| Sum | 1125929182 |
| Variance | 65.44080025 |
| Value | Count | Frequency (%) | |
| 1126 | 999904 | > 99.9% | |
| 460 | 14 | < 0.1% | |
| 1034 | 13 | < 0.1% | |
| 40 | 9 | < 0.1% | |
| 426 | 9 | < 0.1% | |
| 41 | 8 | < 0.1% | |
| 415 | 6 | < 0.1% | |
| 17 | 5 | < 0.1% | |
| 1005 | 4 | < 0.1% | |
| 34 | 4 | < 0.1% | |
| Other values (15) | 24 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 14 | 1 | < 0.1% | |
| 15 | 3 | < 0.1% | |
| 17 | 5 | < 0.1% | |
| 24 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1126 | 999904 | > 99.9% | |
| 1034 | 13 | < 0.1% | |
| 1006 | 1 | < 0.1% | |
| 1005 | 4 | < 0.1% | |
| 992 | 2 | < 0.1% |
DESTINATION_COUNTRY_ID
| Distinct count | 154 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 224.897212 |
|---|---|
| Minimum | -1 |
| Maximum | 1126 |
| Zeros | 750311 |
| Zeros (%) | 75.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1126 |
| Maximum | 1126 |
| Range | 1127 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 447.7535836 |
|---|---|
| Coefficient of variation (CV) | 1.990925453 |
| Kurtosis | 0.2774131595 |
| Mean | 224.897212 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.507474895 |
| Sum | 224897212 |
| Variance | 200483.2717 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 750311 | 75.0% |
| 1126 | 192653 | 19.3% |
| -1 | 24077 | 2.4% |
| 40 | 20106 | 2.0% |
| 109 | 4977 | 0.5% |
| 995 | 2159 | 0.2% |
| 1011 | 1662 | 0.2% |
| 1103 | 886 | 0.1% |
| 1094 | 528 | 0.1% |
| 18 | 479 | < 0.1% |
| Other values (144) | 2162 | 0.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -1 | 24077 | 2.4% |
| 0 | 750311 | 75.0% |
| 4 | 3 | < 0.1% |
| 12 | 43 | < 0.1% |
| 14 | 38 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1126 | 192653 | 19.3% |
| 1124 | 1 | < 0.1% |
| 1115 | 2 | < 0.1% |
| 1113 | 1 | < 0.1% |
| 1112 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
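The calculation described above can be sketched in plain Python. This is a minimal illustration of the definition, not the code the report itself runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n-1) factors of covariance and variances cancel, so they are omitted.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2 * a + 1 for a in x]          # perfectly linear, positive slope
y_down = [-a for a in x]               # perfectly linear, negative slope

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
# Changing location and scale of x separately leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y_up)
```

Both linear series give an r of magnitude 1 (up to floating-point rounding), and rescaling x does not change the result, which illustrates the invariance mentioned above.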
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
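As a rough sketch of that definition, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names (`average_ranks`, `spearman_rho`) and the sample data are hypothetical, and ties are given their average rank, which is one common convention.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]                 # monotonic but not linear

rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

For this cubic relationship ρ is 1 while Pearson's r stays below 1, which is the sense in which ρ captures nonlinear monotonic association.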
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
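The pair-counting formula above can be written out directly. The sketch below implements exactly that formula (often called τ-a) with a hypothetical function name and toy data; library implementations frequently use a tie-adjusted variant instead, so results on data with many ties may differ.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]                 # one swapped pair out of six

tau = kendall_tau(x, y)          # (5 - 1) / 6
tau_reversed = kendall_tau(x, [4, 3, 2, 1])
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = 4/6. Fully reversed data gives τ = -1.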
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | EVENTDATE | EVENT_LABEL | PROVIDER | COSID | MSISDN | ORIGINATIONNETWORKID | DESTINATIONNETWORKID | ROAMING_FLAG | BILLED_FLAG | SHORTCODEID | TOT_CHARGED_AMT | BALANCERELOAD | TOT_ACTUAL_DURATION | NO_OF_EVENTS | BONUS | TOT_ROUNDED_VOL | CELL_ID | CLASSIFICATION | PROMOTION_CD | PEAKID | RELOADFACEVALUE | IMSI | MONTHID | LOAD_ID | LOADDATE | CLASSIFICATION2 | ROAMING_DETAILS | HOME_CELLID | ORIGINATING_COUNTRY_ID | DESTINATION_COUNTRY_ID | PS_TYPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20190802 | 46 | 1000 | 1070 | 233204922078 | 1 | 1 | 0 | N | 0 | 0.00 | 0.0 | 13411 | 1 | 0 | 5544297 | 54358.0 | 100 | -1 | 2 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.020 | 54358.0 | 1126 | 0 | 1.0 |
| 1 | 20190802 | 46 | 1000 | 1021 | 233203520024 | 1 | 1 | 0 | N | 0 | 0.00 | 0.0 | 8 | 3 | 0 | 0 | 65535.0 | 100 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.111 | 65535.0 | 1126 | 0 | 1.0 |
| 2 | 20190802 | 139 | 1000 | 1090 | 233500157746 | 1 | 1 | 0 | Y | 0 | 17.99 | 0.0 | 0 | 1 | 0 | 0 | NaN | BDLYOUTH1MTLY | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | BUNDLE_SUBSCRIPTION | NaN | 29776.0 | 1126 | 0 | NaN |
| 3 | 20190802 | 139 | 1000 | 1070 | 233204426836 | 1 | 1 | 0 | Y | 0 | 5.00 | 0.0 | 0 | 1 | 0 | 0 | NaN | DATABUNDDR5WLNR | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | BUNDLE_SUBSCRIPTION | NaN | 34185.0 | 1126 | 0 | NaN |
| 4 | 20190802 | 4 | 1000 | 1020 | 233202935215 | 1 | 1 | 0 | Y | 0 | 0.10 | 0.0 | 0 | 1 | 0 | 0 | NaN | TONERNW@crbtuser | 0 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | CRBT_REVENUE | NaN | 16630.0 | 1126 | 0 | NaN |
| 5 | 20190802 | 4 | 1000 | 1021 | 233206498686 | 1 | 1 | 0 | Y | 0 | 0.03 | 0.0 | 0 | 1 | 0 | 0 | NaN | SUB@crbtuser | 0 | 3 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | CRBT_REVENUE | NaN | 15612.0 | 1126 | 0 | NaN |
| 6 | 20190802 | 4 | 1000 | 1090 | 233502959393 | 1 | 1 | 0 | Y | 0 | 0.10 | 0.0 | 0 | 1 | 0 | 0 | NaN | STATUS@crbtuser | 0 | 5 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | CRBT_REVENUE | NaN | 37710.0 | 1126 | 0 | NaN |
| 7 | 20190802 | 4 | 1000 | 1021 | 233202471084 | 1 | 1 | 0 | Y | 0 | 0.15 | 0.0 | 0 | 1 | 0 | 0 | NaN | SUB@crbtuser2 | 0 | 1 | 0.0 | 6.200204e+14 | 201908 | 1 | 03-AUG-19 | CRBT_REVENUE | NaN | 24073.0 | 1126 | 0 | NaN |
| 8 | 20190802 | 46 | 1000 | 1090 | 233243585839 | 1 | 1 | 0 | N | 0 | 0.00 | 0.0 | 408 | 1 | 0 | 102921 | 65535.0 | 1017 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.091 | 65535.0 | 1126 | 0 | 1.0 |
| 9 | 20190802 | 46 | 1000 | 1090 | 233500481791 | 1 | 1 | 0 | N | 0 | 0.00 | 0.0 | 735 | 1 | 0 | 5128 | 39017.0 | 500 | -1 | 5 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.028 | 39017.0 | 1126 | 0 | 1.0 |
Last rows
| | EVENTDATE | EVENT_LABEL | PROVIDER | COSID | MSISDN | ORIGINATIONNETWORKID | DESTINATIONNETWORKID | ROAMING_FLAG | BILLED_FLAG | SHORTCODEID | TOT_CHARGED_AMT | BALANCERELOAD | TOT_ACTUAL_DURATION | NO_OF_EVENTS | BONUS | TOT_ROUNDED_VOL | CELL_ID | CLASSIFICATION | PROMOTION_CD | PEAKID | RELOADFACEVALUE | IMSI | MONTHID | LOAD_ID | LOADDATE | CLASSIFICATION2 | ROAMING_DETAILS | HOME_CELLID | ORIGINATING_COUNTRY_ID | DESTINATION_COUNTRY_ID | PS_TYPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 999990 | 20190802 | 46 | 1000 | 1090 | 233509088971 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 369 | 1 | 0 | 5890558 | 23330.0 | 100 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.030 | 23330.0 | 1126 | 0 | 0.0 |
| 999991 | 20190802 | 46 | 1000 | 1070 | 233501740319 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 42 | 1 | 0 | 8904 | 31239.0 | 100 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.028 | 31239.0 | 1126 | 0 | 0.0 |
| 999992 | 20190802 | 46 | 1000 | 1090 | 233503791094 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 5216 | 7 | 0 | 31787670 | 38762.0 | 100 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.028 | 38762.0 | 1126 | 0 | 1.0 |
| 999993 | 20190802 | 46 | 1000 | 1021 | 233209494500 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 18 | 1 | 0 | 9702 | 14098.0 | 1016 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.020 | 14098.0 | 1126 | 0 | 0.0 |
| 999994 | 20190802 | 46 | 1000 | 1070 | 233247981441 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 408 | 1 | 0 | 0 | 31099.0 | 1010 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.020 | 31099.0 | 1126 | 0 | 0.0 |
| 999995 | 20190802 | 46 | 1000 | 1090 | 233208673711 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 21238 | 2 | 0 | 44722552 | 5113.0 | 1016 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.101 | 5113.0 | 1126 | 0 | 6.0 |
| 999996 | 20190802 | 46 | 1000 | 1021 | 233508378181 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 492 | 1 | 0 | 267823 | 14581.0 | 1018 | -1 | 5 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.022 | 14581.0 | 1126 | 0 | 1.0 |
| 999997 | 20190802 | 46 | 1000 | 1070 | 233205754189 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 158 | 1 | 0 | 428120 | 13905.0 | 100 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.020 | 13905.0 | 1126 | 0 | 1.0 |
| 999998 | 20190802 | 46 | 1000 | 1090 | 233200737695 | 1 | 1 | 0 | N | 0 | 0.00000 | 0.0 | 20293 | 1 | 0 | 182839 | 13641.0 | 500 | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | GPRS | 080.087.092.091 | 13641.0 | 1126 | 0 | 1.0 |
| 999999 | 20190802 | 1 | 1000 | 1070 | 233507696654 | 1 | 2 | 0 | Y | 0 | 0.55733 | 0.0 | 209 | 3 | 0 | 0 | 63908.0 | VOICE | -1 | 1 | 0.0 | 6.200205e+14 | 201908 | 1 | 03-AUG-19 | VOICE | 233200005 | 63908.0 | 1126 | 1126 | NaN |